- I've reconsidered my position on the "framing tags" in HTML after
- a more careful reading of the SGML standard, and after receiving
- the O'Reilly/HaL DocBook materials and the MidasWWW browser.
-
- To refresh your memory...
-
- > Currently HTML documents are transmitted without the normal SGML framing
- > tags, but if these are included parsers will ignore them.
- >
- >I don't know what "the normal SGML framing tags" are. An SGML document
- >has three parts: the SGML declaration, the prologue, and the instance.
- >It is common in SGML applications to use an implied SGML declaration
- >and include the prologue by reference (kinda like an #include
- >directive in C.) but without these "framing tags," it's just not an
- >SGML document.
-
- The SGML standard is big on the distinction between entities and
- everything else; that is, the physical breakup of an SGML document
- into storage units such as files, directories, or MIME body parts
- (collectively, "entities") is pretty much arbitrary. You can't break
- <TITLE> between <TI and TLE>, but other than that, it's pretty much
- fair game.
-
- So it appears that it's not necessary or even wise to model the HTML
- data format as an SGML document entity, but rather as an SGML text
- entity. That is, the way to validate/parse an HTML document is not to
- sic the parser on the text/html body part itself, but on a document
- consisting of two entities: the HTML DTD entity, and the text/html
- body part.
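-
- A minimal sketch of that two-entity arrangement in Python. The
- function name, the default prologue, and the "html.dtd" system
- identifier are illustrative assumptions, not part of any
- specification; a real setup would hand the result to an SGML parser.

```python
# Hypothetical helper: the prologue text and the "html.dtd" system
# identifier are assumptions made for illustration only.
DEFAULT_PROLOGUE = '<!DOCTYPE HTML SYSTEM "html.dtd">\n'


def as_sgml_document(body: str, prologue: str = DEFAULT_PROLOGUE) -> str:
    """Return the text a parser would see: the prologue entity
    followed by the text/html body part, unchanged."""
    return prologue + body


# The body part travels on its own, with no prologue of its own.
html_body = "<TITLE>Example</TITLE>\n<H1>Hello</H1>\n"

# The receiver supplies the prologue before parsing.
document = as_sgml_document(html_body)
```

- The body part itself is never modified; the receiving side decides
- which prologue to put in front of it.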
-
- If we were talking about a text/c-program content type, what I
- was suggesting would be like putting the line:
-
- #include <stdlib.h>
-
- at the top of every text/c-program body part. What I'm suggesting
- now is like assuming every text/c-program gets stdlib.h prepended
- before compiling.
-
- This assumes that text/html data always has the HTML DTD entity in
- front of it, but that assumption has always been there.
-
- Besides, forcing a text/html parser to grok SGML document entities
- creates some sticky issues -- we'd have to limit the prologue
- to the simple <!DOCTYPE HTML SYSTEM>, and that's not really legal.
- You're supposed to be able to do things like:
-
- <!DOCTYPE HTML SYSTEM [
- <!ENTITY smiley ":-)"> <!-- add my own "macro" -->
- ]>
- <HTML><TITLE>The history of the smiley: &smiley;</TITLE>
- ...
-
- If we adopt this change of perspective, we should make it clear
- in the HTML specification that for the purposes of SGML, a text/html
- body part is not an SGML document entity, but an SGML text
- entity. The html.dtd entity and the text/html body part text entity
- comprise an SGML document. I need to update html.dtd, fix-html.pl,
- and the www_and_frame materials to reflect this change of
- perspective.
-
- By the way: this change makes it more straightforward to use an
- SGML declaration other than the default, e.g. to increase NAMELEN
- to allow tags larger than 8 characters. Should we do that while
- we're at it?
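-
- To get a feel for what the 8-character limit rules out, here is a
- rough Python sketch that lists tag names longer than NAMELEN. The
- regular expression is a deliberate simplification of SGML
- tokenizing, and the function name and sample input are hypothetical.

```python
import re

REFERENCE_NAMELEN = 8  # NAMELEN in the SGML reference concrete syntax

# Simplified pattern for a start-tag or end-tag name. Real SGML
# tokenizing is more involved, so treat this as an approximation.
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9.-]*)")


def names_over_limit(text: str, namelen: int = REFERENCE_NAMELEN) -> list[str]:
    """Return the distinct tag names in text longer than namelen, sorted."""
    names = {m.group(1).upper() for m in TAG_NAME.finditer(text)}
    return sorted(n for n in names if len(n) > namelen)


# Hypothetical sample input, chosen to include one long tag name.
sample = "<TITLE>t</TITLE><BLOCKQUOTE>q</BLOCKQUOTE><ADDRESS>a</ADDRESS>"
too_long = names_over_limit(sample)
```

- Passing a larger namelen models the effect of an SGML declaration
- that raises the limit.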
-
- Dan
-
- p.s. Check out the MidasWWW browser. It's long overdue in the WWW
- project, but it's worth the wait!
-
-